5 research outputs found

Optimal Map Reduce Job Capacity Allocation in Cloud Systems.

Author: A. M. Rizzi
D. Ardagna
M. Ciavotta
M. Malekimajd
M. Passacantando
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

We are entering a Big Data world. Many sectors of our economy are now guided by data-driven decision processes. Big Data and business intelligence applications are facilitated by the MapReduce programming model while, at the infrastructural layer, cloud computing provides flexible and cost-effective solutions for allocating large clusters on demand. Capacity allocation in such systems is a key challenge in providing performance for MapReduce jobs while minimizing cloud resource costs. The contribution of this paper is twofold: (i) we provide new upper and lower bounds for MapReduce job execution time in shared Hadoop clusters; (ii) we formulate a linear programming model able to minimize cloud resource costs and job rejection penalties for the execution of jobs of multiple classes with (soft) deadline guarantees. Simulation results show that the execution time of MapReduce jobs falls within 14% of our upper bound on average. Moreover, numerical analyses demonstrate that our method is able to determine the global optimal solution of the linear problem for systems including up to 1,000 user classes in less than 0.5 seconds.

Archivio istituzionale della ricerca - Politecnico di Milano

Archivio della Ricerca - Università di Pisa

An optimization framework for the capacity allocation and admission control of MapReduce jobs in cloud systems

Author: Ardagna D.
Ciavotta M.
Gianniti E.
Malekimajd M.
Passacantando M.
Rizzi A. M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Nowadays, we live in a Big Data world and many sectors of our economy are guided by data-driven decision processes. Big Data and Business Intelligence applications are facilitated by the MapReduce programming model while, at the infrastructural layer, cloud computing provides flexible and cost-effective solutions for provisioning large clusters on demand. Capacity allocation in such systems, meant as the problem of providing computational power to support concurrent MapReduce applications in a cost-effective fashion, represents a challenge of paramount importance. In this paper we lay the foundation for a solution implementing admission control and capacity allocation for MapReduce jobs with a priori deadline guarantees. In particular, shared Hadoop 2.x clusters supporting batch and/or interactive jobs are targeted. We formulate a linear programming model able to minimize cloud resource costs and rejection penalties for the execution of jobs belonging to multiple classes with deadline guarantees. Scalability analyses demonstrated that the proposed method is able to determine the global optimal solution of the linear problem for systems including up to 10,000 classes in less than 1 s.

Archivio istituzionale della ricerca - Politecnico di Milano

Archivio della Ricerca - Università di Pisa

D-SPACE4Cloud: A Design Tool for Big Data Applications

Author: A Aleti
A Castiglione
A Verma
E Vianna
ED Lazowska
F Brosig
HV Jagadish
K Kambatla
KH Lee
M Bertoli
M Malekimajd
M Tribastone
MR Garey
S Becker
W Zhang
Z Zhang
Publication date: 01/01/2016

Recent years have seen a steep rise in data generation worldwide, with the development and widespread adoption of several software projects targeting the Big Data paradigm. Many companies currently engage in Big Data analytics as part of their core business activities; nonetheless, there are no tools and techniques to support the design of the underlying hardware configuration backing such systems. In particular, the focus in this report is set on Cloud-deployed clusters, which represent a cost-effective alternative to on-premises installations. We propose a novel tool implementing a battery of optimization and prediction techniques, integrated so as to efficiently assess several alternative resource configurations and determine the minimum-cost cluster deployment satisfying QoS constraints. Further, the experimental campaign conducted on real systems shows the validity and relevance of the proposed method.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Optimal Capacity Allocation for executing Map Reduce Jobs in Cloud Systems

Author: Ardagna D.
Ciavotta M.
Malekimajd M.
Movaghar A.
Passacantando Mauro
Rizzi A. M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Nowadays, analyzing large amounts of data is of paramount importance for many companies. Big data and business intelligence applications are facilitated by the MapReduce programming model while, at the infrastructural layer, cloud computing provides flexible and cost-effective solutions for allocating large clusters on demand. Capacity allocation in such systems is a key challenge in providing performance for MapReduce jobs while minimizing cloud resource cost. The contribution of this paper is twofold: (i) we formulate a linear programming model able to minimize cloud resource cost and job rejection penalties for the execution of jobs of multiple classes with (soft) deadline guarantees; (ii) we provide new upper and lower bounds for MapReduce job execution time in shared Hadoop clusters. Moreover, our solutions are validated by a large set of experiments. We demonstrate that our method is able to determine the global optimal solution for systems including up to 1000 user classes in less than 0.5 seconds. Moreover, the execution time of MapReduce jobs is within 19% of our upper bounds on average.

Archivio della Ricerca - Università di Pisa

Optimal capacity allocation for executing mapreduce jobs in cloud systems

Author: Ardagna Danilo
Ciavotta Michele
Malekimajd Marzieh
Movaghar A.
Passacantando M.
Rizzi Alessandro Maria
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Archivio istituzionale della ricerca - Politecnico di Milano